Compiler-Assisted Memory Encryption for Embedded Processors
Authors
Abstract
Memory encryption is a critical component in the design of secure processors: it protects the privacy of code and data stored in off-chip memory. Because decryption must precede any load that requires an off-chip memory access, it lies on the critical path and can significantly degrade performance. Hardware counter-based one-time pad encryption techniques [13, 16, 11] have recently been proposed to reduce this overhead. For high-end processors, the performance impact of decryption has been successfully limited for two reasons: fairly large on-chip L1 and L2 caches reduce off-chip accesses, and the additional hardware support proposed in [16, 11] reduces decryption latency. For low- to medium-end embedded processors, however, the performance degradation is high. First, they support only small (if any) on-chip L1 caches, which leads to frequent off-chip accesses. Second, the hardware cost of the latency-reduction solutions in [16, 11] is too high, making them unattractive for embedded processors. In this paper we present a compiler-assisted strategy that uses minimal hardware support to reduce the overhead of memory encryption in low- to medium-end embedded processors. In addition to the global counter used in [13], our technique uses additional compiler-controlled counters, which are maintained in a small number of dedicated on-chip registers. Our experiments show that, for a low-end (medium-end) embedded processor with a 0 KB (32 KB) L1 cache, the proposed technique reduces the average execution time overhead of memory encryption from 60% (13.1%) with a single counter to 12.5% (2.1%) by adding only 8 hardware counter-registers.
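To make the idea of counter-based one-time pad encryption concrete, the following is a minimal, generic sketch and not the scheme from the paper. It assumes the pad is derived from a secret key, the block address, and a counter value; the function names, the 16-byte block size, and the use of SHA-256 as a stand-in for a hardware block cipher are all assumptions made for illustration. The point it shows is that the pad depends only on values that can be held on-chip, so in principle it can be computed while the off-chip fetch is still in flight, leaving only an XOR on the critical path.

```python
import hashlib

BLOCK_SIZE = 16  # bytes per memory block (assumed for this sketch)


def make_pad(key: bytes, address: int, counter: int) -> bytes:
    """Derive a pad from (key, address, counter).

    SHA-256 is only a stand-in here; a real design would use a
    hardware block cipher. The pad needs no data from memory.
    """
    seed = key + address.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return hashlib.sha256(seed).digest()[:BLOCK_SIZE]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(key: bytes, address: int, counter: int, plaintext: bytes) -> bytes:
    """Encrypt one block on a store by XORing it with the pad."""
    return xor_bytes(plaintext, make_pad(key, address, counter))


def decrypt_block(key: bytes, address: int, counter: int, ciphertext: bytes) -> bytes:
    """Decrypt one block on a load; the same counter value must be known."""
    return xor_bytes(ciphertext, make_pad(key, address, counter))


if __name__ == "__main__":
    key = b"demo-key"                 # hypothetical key
    block = b"sixteen byte blk"       # exactly 16 bytes
    ct = encrypt_block(key, 0x1000, 1, block)
    # Round trip succeeds only with the counter value used at store time.
    print(decrypt_block(key, 0x1000, 1, ct) == block)
    print(decrypt_block(key, 0x1000, 2, ct) == block)
```

In this sketch, decryption works only if the counter value used for the store is available at the load. That is why it matters where counters are kept: a counter held in an on-chip register is available immediately, whereas one that must itself be fetched from memory delays pad generation.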
Similar resources
Hardware/software Techniques for Memory Power Optimizations in Embedded Processors
Power has become one of the primary design constraints in modern microprocessors. This is all the more true in the embedded domain where designers are being pushed to create faster processors that operate for long periods of time on a single battery. It is well known that the memory sub-system is responsible for a significant percentage of the overall power dissipation. For example, in the Stro...
Efficient Memory Integrity Verification Schemes for Secure Processors
Single-chip secure processors have recently been proposed for a variety of applications ranging from anti-piracy to trusted execution of distributed processes. Off-chip memory integrity verification and encryption are two fundamental tasks of a single-chip secure processor. Memory integrity verification is regarded as the main bottleneck in improving the performance of secure processors. Differen...
A Compiler Integrated Assistance for Optimum Data Allocation in Banked Memory Embedded Processors
Bank switching in embedded processors having partitioned memory architecture results in code size as well as run time overhead. An algorithm and its application to assist the compiler in eliminating the redundant bank switching codes introduced and deciding the optimum data allocation to banked memory is presented in this work. A relation matrix formed for the memory bank state transition corre...
A network flow approach to memory bandwidth utilization in embedded DSP core processors
This paper presents a network flow approach to solving the register binding and allocation problem for multiword memory access DSP processors. In recently announced DSP processors, sixteen bit instructions which simultaneously access four words from memory are supported. A polynomial-time network flow methodology is used to allocate multiword accesses, including constant data memory layout, whi...
Processor-Directed Cache Coherence Mechanism – A Performance Study
Cache coherent multiprocessor architecture is widely used in the recent multi-core systems, embedded systems and massively parallel processors. With the ever increasing performance gap between processor and memory, there is a requirement for an optimal cache coherence mechanism in a cache coherent multiprocessor. The conventional directory based cache coherence scheme used in large scale multip...
Journal title:
- Trans. HiPEAC
Volume 2, Issue
Pages -
Publication date 2007